✨ switch to Huber Regression with L2 normalization for sliding window method#44

Merged

sambra95 merged 6 commits into main from switch_to_huber_reg

Feb 27, 2026

Collaborator

enryH commented Feb 25, 2026

before (Theil-Sen):

{ 'sliding_window': { 'model_type': 'sliding_window',
                      'params': { 'fit_t_max': 36.6,
                                  'fit_t_min': 35.4,
                                  'intercept': -4.663280857018275,
                                  'slope': 0.0778949612485742,
                                  'time_at_umax': 36.0,
                                  'window_points': 7}}}

now (HuberRegressor):

{ 'sliding_window': { 'model_type': 'sliding_window',
                      'params': { 'fit_t_max': 36.8,
                                  'fit_t_min': 35.6,
                                  'intercept': -4.663248126266753,
                                  'slope': 0.07789418105471918,
                                  'time_at_umax': 36.2,
                                  'window_points': 7}}}

The latter is roughly twice as fast.
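For readers unfamiliar with the estimator being replaced, here is a rough plain-Python sketch of what a Theil-Sen fit computes: the median of all pairwise slopes in the window. It is not scipy's theilslopes implementation. The helper name theil_sen_line, the synthetic 7-point window, and the intercept convention (median of y minus slope times median of t) are assumptions made for illustration.

```python
from itertools import combinations
from statistics import median


def theil_sen_line(t, y):
    """Median of pairwise slopes, with a median-based intercept (illustrative)."""
    slopes = [
        (y[j] - y[i]) / (t[j] - t[i])
        for i, j in combinations(range(len(t)), 2)
        if t[j] != t[i]  # skip pairs with identical time stamps
    ]
    slope = median(slopes)
    intercept = median(y) - slope * median(t)
    return slope, intercept


# Hypothetical 7-point window of log-transformed readings with one outlier.
t_win = [35.4, 35.6, 35.8, 36.0, 36.2, 36.4, 36.6]
y_log_win = [0.078 * t - 4.66 for t in t_win]
y_log_win[3] += 0.5  # single corrupted reading

slope, intercept = theil_sen_line(t_win, y_log_win)
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```

With 7 points there are 21 pairs, and the single corrupted reading affects only 6 of them, so the median slope stays at the slope of the clean points.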

A regression in the plotting.ipynb tutorial still needs to be fixed in a separate PR.

enryH added 6 commits

February 25, 2026 11:08


          ✨ switch to Huber Regression with L2 normalization

7f170ea


          🔧 add scikit learn as dependency

35ad2b6


          🔥 remove requirements.txt

c0f4e61


          Merge branch 'main' into switch_to_huber_reg

b76a27d


          🔧 switch to specify dependencies in pyproject.toml

25a1d7e


          Merge branch 'main' into switch_to_huber_reg

106387f

enryH requested review from Copilot and sambra95

February 27, 2026 15:25

enryH marked this pull request as ready for review

February 27, 2026 15:25

Copilot started reviewing on behalf of enryH

February 27, 2026 15:25

Copilot AI reviewed

Contributor

Copilot AI left a comment

Pull request overview

This pull request switches the sliding window method from Theil-Sen regression to HuberRegressor for approximately 2x performance improvement while maintaining similar accuracy. Additionally, it migrates dependency management from requirements.txt to direct specification in pyproject.toml, which is a more modern and streamlined approach.

Changes:

Replaced scipy's theilslopes with scikit-learn's HuberRegressor in the sliding window fitting algorithm
Migrated dependencies from requirements.txt to pyproject.toml's dependencies field
Added scikit-learn as a new dependency
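As a rough illustration of the sliding window approach touched by this change, the sketch below fits a line to the log-transformed values in each window and keeps the steepest fit. It is not the code in non_parametric.py. To stay runnable with the standard library only, it uses ordinary least squares as a stand-in for the robust fitter (it is neither Theil-Sen nor HuberRegressor). The function names, the synthetic data, and the definition of time_at_umax as the window centre are assumptions.

```python
from math import exp, log
from statistics import mean


def least_squares_line(t, y):
    """Plain least-squares line; a stand-in for a robust fitter."""
    t_bar, y_bar = mean(t), mean(y)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    slope = sxy / sxx
    return slope, y_bar - slope * t_bar


def steepest_window(t, y, window_points=7, fit_line=least_squares_line):
    """Fit a line to log(y) in each window and return the steepest fit."""
    y_log = [log(v) for v in y]
    best = None
    for start in range(len(t) - window_points + 1):
        stop = start + window_points
        t_win, y_log_win = t[start:stop], y_log[start:stop]
        slope, intercept = fit_line(t_win, y_log_win)
        if best is None or slope > best["slope"]:
            best = {
                "slope": slope,
                "intercept": intercept,
                "fit_t_min": t_win[0],
                "fit_t_max": t_win[-1],
                "time_at_umax": mean(t_win),  # assumed: window centre
                "window_points": window_points,
            }
    return best


# Synthetic curve: flat, then exponential growth (rate 0.4), then flat again.
t = [0.5 * i for i in range(30)]
y = [0.05 * exp(0.4 * min(max(ti - 3.0, 0.0), 6.0)) for ti in t]
params = steepest_window(t, y)
print(params)
```

Because the fitter is passed in as fit_line, a robust estimator with the same (t, y) -> (slope, intercept) shape could be swapped in without changing the window search.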

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 6 comments.

File	Description
src/growthcurves/non_parametric.py	Replaced Theil-Sen estimator with HuberRegressor for robust linear regression in sliding window method; added sklearn import
requirements.txt	Removed file entirely as dependencies now specified in pyproject.toml
pyproject.toml	Added direct dependency specification including new scikit-learn dependency; commented out requirements.txt loading

src/growthcurves/non_parametric.py

Comment on lines +94 to +98

+                      # # Use Theil-Sen estimator for robust line fitting
+                      # result = theilslopes(y_log_win, t_win)
+                      # slope, intercept = result.slope, result.intercept
+                      # # Use HuberRegressor which uses L2 regularization and is twice as fast as
+                      # # Theil-Sen.

Copilot AI Feb 27, 2026

The comment has a double hash '# #' which appears to be a formatting issue. This should be cleaned up to use a single '#' for consistency with standard Python comment formatting.

Copilot uses AI. Check for mistakes.

src/growthcurves/non_parametric.py

Comment on lines +94 to +96

+                      # # Use Theil-Sen estimator for robust line fitting
+                      # result = theilslopes(y_log_win, t_win)
+                      # slope, intercept = result.slope, result.intercept

Copilot AI Feb 27, 2026

The commented-out code for Theil-Sen should be removed rather than left in place. Since this is a deliberate switch to HuberRegressor and the PR description indicates it's roughly twice as fast with similar results, the old code should be deleted to keep the codebase clean.

src/growthcurves/non_parametric.py

Comment on lines +97 to +98

		# # Use HuberRegressor which uses L2 regularization and is twice as fast as
		# # Theil-Sen.

Copilot AI Feb 27, 2026

The comment says HuberRegressor "uses L2 regularization", which is misleading. HuberRegressor does apply an L2 penalty to the coefficients (controlled by the alpha parameter), but its defining feature is the Huber loss function, which makes the fit robust to outliers. The comment should describe that robustness rather than the regularization.

Suggested change

-                    # # Use HuberRegressor which uses L2 regularization and is twice as fast as
-                    # # Theil-Sen.
+                    # # Use HuberRegressor, which is robust to outliers via the Huber loss
+                    # # and is typically faster than Theil-Sen.
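To make the distinction concrete, the sketch below separates the two ingredients: the Huber loss applied to residuals and the L2 penalty applied to coefficients. It is a simplified plain-Python illustration, not scikit-learn's code. In particular, it assumes the residuals are already scaled (scikit-learn additionally estimates a scale parameter), and the function names are hypothetical. The defaults epsilon=1.35 and alpha=1e-4 mirror HuberRegressor's documented defaults.

```python
def huber_loss(residual, epsilon=1.35):
    """Quadratic for small residuals, linear for large ones."""
    z = abs(residual)
    if z <= epsilon:
        return z * z
    return 2.0 * epsilon * z - epsilon * epsilon


def penalized_objective(residuals, coef, alpha=1e-4, epsilon=1.35):
    """Sum of Huber losses (data term) plus an L2 penalty on the coefficients."""
    data_term = sum(huber_loss(r, epsilon) for r in residuals)
    l2_term = alpha * sum(w * w for w in coef)
    return data_term + l2_term


# A large residual is penalized far less than under a squared loss.
print(huber_loss(1.0), huber_loss(10.0), 10.0 ** 2)
```

The robustness to outliers comes from the linear tail of huber_loss; the alpha term only shrinks the coefficients and is small by default.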

src/growthcurves/non_parametric.py

               from logging import getLogger
               import numpy as np
+              import sklearn.linear_model

Copilot AI Feb 27, 2026

The import style 'import sklearn.linear_model' is inconsistent with the rest of the codebase, which uses the 'from <package>.<module> import <class>' pattern (e.g., 'from scipy.interpolate import make_smoothing_spline'). Consider using 'from sklearn.linear_model import HuberRegressor' for consistency.

src/growthcurves/non_parametric.py

Comment on lines +17 to +19

		# from scipy.stats import theilslopes

Copilot AI Feb 27, 2026

The commented-out import from scipy.stats import theilslopes should be removed entirely as it's no longer needed after switching to HuberRegressor.

Suggested change

- # from scipy.stats import theilslopes

pyproject.toml

Comment on lines 25 to +29

               # use requirements.txt instead of pyproject.toml for dependencies
               # https://stackoverflow.com/a/73600610/9684872
               # ! uncomment also dependencies in the dynamic section above
-              [tool.setuptools.dynamic]
-              dependencies = {file = ["requirements.txt"]}
+              # [tool.setuptools.dynamic]
+              # dependencies = {file = ["requirements.txt"]}

Copilot AI Feb 27, 2026

The comments left over from the previous requirements.txt approach should be removed. Lines 25-27 and 28-29 describe the old setup and are misleading now that dependencies are defined directly in the dependencies list.

sambra95 approved these changes

sambra95 merged commit c6c554d into main

11 checks passed

sambra95 deleted the switch_to_huber_reg branch

February 27, 2026 15:46
